Non-negative principal component analysis for NMR-based metabolomic data analysis
Abstract
Keywords: Non-negative principal component analysis; Local feature representation; k + 1 rule; NMR-based metabolomics; Multivariate data analysis
Proton nuclear magnetic resonance (1H-NMR) spectroscopy is one of the major analytical platforms used in metabolomics. The data acquired from NMR experiments are frequently processed using multivariate statistical methods such as principal component analysis (PCA) and partial least squares (PLS) to extract biologically meaningful information from complex spectra. Conventionally, these methods produce components with both positive and negative loadings, which conflicts with the non-negativity of Fourier-transformed NMR spectra. In recent years, there has been increasing interest in incorporating non-negative constraints into multivariate methods. In the current study, a non-negative principal component analysis (NPCA) algorithm was introduced for the analysis of NMR-based metabolomic data. Using a simulated dataset, we showed that NPCA could reveal interesting local features in a multivariate dataset that are hidden in a conventional PCA model. Notably, simulated peaks arising from a single compound were extracted by the same component in the NPCA model. The results also indicated that NPCA is less susceptible to noise than PCA. Furthermore, a supervised version of NPCA (sNPCA) was developed for class discrimination analysis, and it was used to identify urinary metabolites that distinguished hyperthyroid patients from healthy volunteers. Our results demonstrated that both NPCA and sNPCA could produce easily interpretable results and provide information additional to that of conventional projection methods.
Proton nuclear magnetic resonance (1H-NMR) spectroscopy and mass spectrometry (MS) constitute the leading profiling technologies in metabolomics. In general, MS-based methods offer higher detection sensitivity and broader coverage of the metabolome.
The main advantage of NMR is its nondestructive nature, together with the minimal sample preparation it requires. Over the years, NMR-based metabolomics has found wide application, for example in physiological evaluation [1], drug safety assessment [2,3], characterisation of animal models of disease [4], diagnosis of disease [5], and drug therapy monitoring [6]. NMR metabolomic data are characterised by very high dimensionality. However, it is generally accepted that only a small set of metabolites is important for biological interpretation. Many feature selection algorithms have been used to reduce the dimensionality of metabolomic data, including principal component analysis (PCA) [7], partial least squares (PLS), independent component analysis (ICA) [8–10], non-negative matrix factorisation (NMF) [11], as well as their variants [12–14]. Among these methods, PCA is regarded as the de facto standard …
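To make the idea of a non-negativity constraint on loadings concrete, the following is a minimal sketch and not the NPCA algorithm of the paper, whose details are not given in this excerpt. It assumes a simple projected power iteration: multiply by the sample covariance matrix, clip negative entries to zero, and renormalise. The function name `first_nonneg_component` and the toy two-compound dataset are hypothetical and exist only for illustration.

```python
import numpy as np


def first_nonneg_component(X, n_iter=500, tol=1e-10):
    """Approximate one non-negative, unit-norm loading vector.

    Assumption: projected power iteration on the sample covariance.
    This is an illustrative heuristic, not the published NPCA method.
    """
    Xc = X - X.mean(axis=0)                 # mean-centre each variable
    S = Xc.T @ Xc / (X.shape[0] - 1)        # sample covariance matrix
    w = np.ones(S.shape[1]) / np.sqrt(S.shape[1])  # non-negative start
    for _ in range(n_iter):
        v = np.maximum(S @ w, 0.0)          # power step, then clip negatives
        norm = np.linalg.norm(v)
        if norm == 0.0:                     # degenerate case: keep current w
            break
        v /= norm
        converged = np.linalg.norm(v - w) < tol
        w = v
        if converged:
            break
    return w


# Toy data (hypothetical): two "compounds", each contributing two peaks.
rng = np.random.default_rng(0)
conc_a = rng.uniform(0.0, 1.0, size=40)     # concentration of compound A
conc_b = rng.uniform(0.0, 1.0, size=40)     # concentration of compound B
peaks_a = np.array([1.0, 0.6, 0.0, 0.0, 0.0, 0.0])
peaks_b = np.array([0.0, 0.0, 0.0, 0.8, 0.5, 0.0])
X = (np.outer(conc_a, peaks_a) + np.outer(conc_b, peaks_b)
     + 0.01 * rng.standard_normal((40, 6)))

w = first_nonneg_component(X)
print(np.round(w, 3))
```

Every entry of the returned vector is zero or positive, so it can be read directly as a set of spectral variables that vary together, whereas an unconstrained PCA loading vector may mix positive and negative weights.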
Similar resources
HiRes - a tool for comprehensive assessment and interpretation of metabolomic data
The increasing role of metabolomics in system biology is driving the development of tools for comprehensive analysis of high-resolution NMR spectral datasets. This task is quite challenging since unlike the datasets resulting from other 'omics', a substantial preprocessing of the data is needed to allow successful identification of spectral patterns associated with relevant biologica...
1H NMR studies distinguish the water soluble metabolomic profiles of untransformed and RAS-transformed cells
Metabolomic profiling is an increasingly important method for identifying potential biomarkers in cancer cells with a view towards improved diagnosis and treatment. Nuclear magnetic resonance (NMR) provides a potentially noninvasive means to accurately characterize differences in the metabolomic profiles of cells. In this work, we use (1)H NMR to measure the metabolomic profiles of water solubl...
Negative impact of noise on the principal component analysis of NMR data.
Principal component analysis (PCA) is routinely applied to the study of NMR based metabolomic data. PCA is used to simplify the examination of complex metabolite mixtures obtained from biological samples that may be composed of hundreds or thousands of chemical components. PCA is primarily used to identify relative changes in the concentration of metabolites to identify trends or characteristic...
Metabolomic biomarkers in women with polycystic ovary syndrome: a pilot study.
The aim of this study was to investigate whether women with polycystic ovary syndrome (PCOS) had a unique metabolomic profile that was different from controls and to assess the feasibility of a definitive study. Twelve women with PCOS and 10 healthy women as controls had measurements of demographic and anthropometric data, venepunctures and assays on plasma samples for metabolomic profiles usi...
Nuclear magnetic resonance-based serum metabolic profiling of dairy cows with footrot
Footrot is a debilitating and contagious disease in dairy cows, caused by the Gram-negative anaerobe Dichelobacter nodosus. 1H-NMR (nuclear magnetic resonance)-based metabolomics has been previously used to understand the pathology and etiology of several diseases. The objective of this study was to characterize serum from dairy cows with footrot (n=10) using 1H-NMR-based metabolomics and chemo...